Events and Controversies: Influences of a Shocking News Event on Information Seeking
It has been suggested that online search and retrieval contributes to the
intellectual isolation of users within their preexisting ideologies, where
people's prior views are strengthened and alternative viewpoints are
infrequently encountered. This so-called "filter bubble" phenomenon has been
called out as especially detrimental when it comes to dialog among people on
controversial, emotionally charged topics, such as the labeling of genetically
modified food, the right to bear arms, the death penalty, and online privacy.
We seek to identify and study information-seeking behavior and access to
alternative versus reinforcing viewpoints following shocking, emotional, and
large-scale news events. As a case study, we analyze search and
browsing on gun control/rights, a strongly polarizing topic for both citizens
and leaders of the United States. We study the period of time preceding and
following a mass shooting to understand how its occurrence, follow-on
discussions, and debate may have been linked to changes in the patterns of
searching and browsing. We employ information-theoretic measures to quantify
the diversity of Web domains of interest to users and understand the browsing
patterns of users. We use these measures to characterize the influence of news
events on these Web search and browsing patterns.
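One common information-theoretic diversity measure is the Shannon entropy of the distribution of visited domains. The abstract does not say which measures the authors use, so the sketch below is only an illustration of the general idea; the function name `domain_entropy` and the sample browsing logs are hypothetical.

```python
import math
from collections import Counter


def domain_entropy(visited_domains):
    """Shannon entropy (in bits) of the empirical distribution of visited domains.

    Higher values mean visits are spread over more domains more evenly.
    This is an assumed stand-in for the paper's measures, not the authors' method.
    """
    counts = Counter(visited_domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical browsing logs for one user before and after a news event.
before = ["a.example", "a.example", "a.example", "b.example"]
after = ["a.example", "b.example", "c.example", "d.example"]

print(domain_entropy(before))  # concentrated on one domain, lower entropy
print(domain_entropy(after))   # spread evenly over four domains, higher entropy
```

Comparing the value before and after an event gives one rough indication of whether a user's browsing became more or less diverse.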
Career Transitions and Trajectories: A Case Study in Computing
From artificial intelligence to network security to hardware design, it is
well-known that computing research drives many important technological and
societal advancements. However, less is known about the long-term career paths
of the people behind these innovations. What do their careers reveal about the
evolution of computing research? Which institutions were and are the most
important in this field, and for what reasons? Can insights into computing
career trajectories help predict employer retention?
In this paper we analyze several decades of post-PhD computing careers using
a large new dataset rich with professional information, and propose a versatile
career network model, R^3, that captures temporal career dynamics. With R^3 we
track important organizations in computing research history, analyze career
movement between industry, academia, and government, and build a powerful
predictive model for individual career transitions. Our study, the first of its
kind, is a starting point for understanding computing research careers, and may
inform employer recruitment and retention mechanisms at a time when the demand
for specialized computational expertise far exceeds supply.
Comment: To appear in KDD 201
REGAL: Representation Learning-based Graph Alignment
Problems involving multiple networks are prevalent in many scientific and
other domains. In particular, network alignment, or the task of identifying
corresponding nodes in different networks, has applications across the social
and natural sciences. Motivated by recent advancements in node representation
learning for single-graph tasks, we propose REGAL (REpresentation
learning-based Graph ALignment), a framework that leverages the power of
automatically-learned node representations to match nodes across different
graphs. Within REGAL we devise xNetMF, an elegant and principled node embedding
formulation that uniquely generalizes to multi-network problems. Our results
demonstrate the utility and promise of unsupervised representation
learning-based network alignment in terms of both speed and accuracy. REGAL
runs up to 30x faster in the representation learning stage than comparable
methods, outperforms existing network alignment methods by 20 to 30% accuracy
on average, and scales to networks with millions of nodes each.
Comment: In Proceedings of the 27th ACM International Conference on
Information and Knowledge Management (CIKM), 201
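The idea of matching nodes across graphs through their learned representations can be sketched as a nearest-neighbor search in embedding space. This is not xNetMF or the REGAL implementation: the embeddings below are made-up two-dimensional vectors, and the function name `align_by_embedding` is hypothetical.

```python
import math


def align_by_embedding(emb_a, emb_b):
    """Map each node of graph A to the node of graph B with the closest embedding.

    emb_a, emb_b: dicts from node id to an embedding (tuple of floats).
    Uses plain Euclidean distance; an assumed simplification for illustration.
    """
    alignment = {}
    for node_a, vec_a in emb_a.items():
        alignment[node_a] = min(
            emb_b, key=lambda node_b: math.dist(vec_a, emb_b[node_b])
        )
    return alignment


# Hypothetical embeddings; in practice these would come from a
# representation learning step applied to both graphs.
emb_a = {"a1": (0.0, 1.0), "a2": (1.0, 0.0)}
emb_b = {"b1": (0.9, 0.1), "b2": (0.1, 0.9)}

alignment = align_by_embedding(emb_a, emb_b)
print(alignment)
```

A brute-force search like this compares every pair of nodes, so it would not scale to the graph sizes the abstract mentions; it only shows the matching step in its simplest form.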
VoG: Summarizing and Understanding Large Graphs
How can we succinctly describe a million-node graph with a few simple
sentences? How can we measure the "importance" of a set of discovered subgraphs
in a large graph? These are exactly the problems we focus on. Our main ideas
are to construct a "vocabulary" of subgraph-types that often occur in real
graphs (e.g., stars, cliques, chains), and from a set of subgraphs, find the
most succinct description of a graph in terms of this vocabulary. We measure
success in a well-founded way by means of the Minimum Description Length (MDL)
principle: a subgraph is included in the summary if it decreases the total
description length of the graph.
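The inclusion rule above can be illustrated with a toy greedy selection. The bit costs are invented, the function name `select_by_mdl` is hypothetical, and the sketch assumes each candidate's savings are independent of the others, which need not hold for overlapping subgraphs; it is not the paper's encoding scheme.

```python
def select_by_mdl(baseline_bits, candidates):
    """Greedily keep candidates that lower the total description length.

    baseline_bits: cost of describing the graph with an empty summary.
    candidates: list of (name, structure_bits, saved_bits), where
        structure_bits is the cost of encoding the subgraph in the summary and
        saved_bits is the reduction in the cost of encoding the rest of the graph.
    All numbers are assumed inputs for illustration.
    """
    summary = []
    total_bits = baseline_bits
    for name, structure_bits, saved_bits in candidates:
        new_total = total_bits + structure_bits - saved_bits
        if new_total < total_bits:  # include only if the total description shrinks
            summary.append(name)
            total_bits = new_total
    return summary, total_bits


# Hypothetical candidate structures with made-up costs.
candidates = [("star", 40, 120), ("clique", 60, 200), ("chain", 50, 30)]
summary, total_bits = select_by_mdl(1000, candidates)
print(summary, total_bits)
```

Here the chain is rejected because encoding it costs more bits than it saves.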
Our contributions are three-fold: (a) formulation: we provide a principled
encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop
VoG, an efficient method to minimize the description cost; and (c)
applicability: we report experimental results on multi-million-edge real
graphs, including Flickr and the Notre Dame web graph.
Comment: SIAM International Conference on Data Mining (SDM) 201
Graph based Anomaly Detection and Description: A Survey
Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, graph data is becoming ubiquitous, and techniques for structured graph data have recently come into focus. As objects in graphs have long-range correlations, a suite of novel techniques has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms, categorized under various settings: unsupervised vs. (semi-)supervised approaches, static vs. dynamic graphs, and attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. Moreover, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.